Stochastic Convex Optimization

Authors

  • Shai Shalev-Shwartz
  • Ohad Shamir
  • Nathan Srebro
  • Karthik Sridharan
Abstract

For supervised classification problems, it is well known that learnability is equivalent to uniform convergence of the empirical risks, and thus to learnability by empirical minimization. Inspired by recent regret bounds for online convex optimization, we study stochastic convex optimization, and uncover a surprisingly different situation in this more general setting: although the stochastic convex optimization problem is learnable (e.g. using online-to-batch conversions), no uniform convergence holds in the general case, and empirical minimization might fail. Rather than being a difference between online methods and a global minimization approach, we show that the key ingredient is strong convexity and regularization. Our results demonstrate that the celebrated theorem of Alon et al. on the equivalence of learnability and uniform convergence does not extend to Vapnik's General Setting of Learning, that in the General Setting considering only empirical minimization is not enough, and that despite Vapnik's result on the equivalence of strict consistency and uniform convergence, uniform convergence is only a sufficient, but not necessary, condition for meaningful non-trivial learnability.
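As a rough illustration of the regularization point (a minimal sketch, not the paper's construction; the toy objective, data distribution, and regularization weight below are assumptions chosen only to make the snippet runnable), one can compare plain empirical minimization of a stochastic convex objective F(w) = E_z[f(w; z)] with its strongly convex regularized variant, which adds (lam/2)*||w||^2 to the empirical average:

    import numpy as np

    # Toy stochastic convex problem (illustrative assumption):
    # f(w; z) = 0.5 * ||w - z||^2 with z drawn i.i.d. from an unknown distribution.
    rng = np.random.default_rng(0)
    d, n, lam = 5, 50, 0.1                            # dimension, sample size, regularization weight
    z = rng.normal(loc=1.0, scale=2.0, size=(n, d))   # training sample z_1, ..., z_n

    # Plain empirical minimization: argmin_w (1/n) sum_i 0.5*||w - z_i||^2, i.e. the sample mean.
    w_erm = z.mean(axis=0)

    # Regularized (lam-strongly convex) empirical minimization:
    # argmin_w (1/n) sum_i 0.5*||w - z_i||^2 + (lam/2)*||w||^2, i.e. a shrunken sample mean.
    w_reg = z.mean(axis=0) / (1.0 + lam)

    def population_risk(w, mu=1.0, sigma=2.0):
        # F(w) = E_z[0.5*||w - z||^2] = 0.5*||w - mu*1||^2 + 0.5*d*sigma^2
        return 0.5 * np.sum((w - mu) ** 2) + 0.5 * d * sigma ** 2

    print("population risk, plain ERM:  ", population_risk(w_erm))
    print("population risk, regularized:", population_risk(w_reg))

In this simple mean-estimation example both estimators happen to behave well; the abstract's point is that in general stochastic convex problems plain empirical minimization may fail to generalize, while the strongly convex, regularized version still learns.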


Similar articles

Disciplined Convex Stochastic Programming: A New Framework for Stochastic Optimization

We introduce disciplined convex stochastic programming (DCSP), a modeling framework that can significantly lower the barrier for modelers to specify and solve convex stochastic optimization problems, by allowing modelers to naturally express a wide variety of convex stochastic programs in a manner that reflects their underlying mathematical representation. DCSP allows modelers to express expect...
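As a rough illustration of the modeling style (a minimal sketch in plain CVXPY rather than the DCSP framework itself; the problem data, sizes, and the sample-average treatment of the expectation are all assumptions), a stochastic program such as min_{x >= 0} E_xi[ ||A(xi) x - b(xi)||^2 ] can be approximated by averaging the convex cost over drawn scenarios:

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    m, n, scenarios = 10, 4, 100          # rows, variables, number of sampled scenarios (assumed sizes)

    x = cp.Variable(n, nonneg=True)

    # Sample-average approximation of E_xi[ ||A(xi) x - b(xi)||^2 ]:
    # draw scenarios (A_k, b_k) and average the per-scenario convex costs.
    costs = []
    for _ in range(scenarios):
        A_k = rng.normal(size=(m, n))
        b_k = rng.normal(size=m)
        costs.append(cp.sum_squares(A_k @ x - b_k))

    problem = cp.Problem(cp.Minimize(sum(costs) / scenarios))
    problem.solve()
    print("approximate solution:", x.value)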


Stochastic Successive Convex Approximation for Non-Convex Constrained Stochastic Optimization

This paper proposes a constrained stochastic successive convex approximation (CSSCA) algorithm to find a stationary point for a general non-convex stochastic optimization problem, whose objective and constraint functions are nonconvex and involve expectations over random states. The existing methods for non-convex stochastic optimization, such as the stochastic (average) gradient and stochastic...
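As a generic illustration of the successive convex approximation pattern (a minimal unconstrained sketch, not the paper's CSSCA algorithm; the toy objective, surrogate, and step sizes are assumptions), each iteration builds a convex surrogate around the current point from a stochastic gradient, minimizes it in closed form, and averages the minimizer into the iterate with a diminishing weight:

    import numpy as np

    rng = np.random.default_rng(0)

    def stoch_grad(x):
        # Stochastic gradient of a toy non-convex objective
        # f(x) = E_z[ log(1 + ||x - z||^2) ] with z ~ N(0, I)  (illustrative assumption).
        z = rng.normal(size=x.shape)
        diff = x - z
        return 2.0 * diff / (1.0 + diff @ diff)

    x = np.ones(5)
    tau = 1.0                              # curvature of the convex proximal surrogate
    for t in range(1, 5001):
        g = stoch_grad(x)
        # Convex surrogate around the current iterate:
        #   g^T (y - x) + (tau / 2) * ||y - x||^2, minimized in closed form by y = x - g / tau.
        x_hat = x - g / tau
        gamma = 1.0 / t                    # diminishing averaging weight
        x = (1.0 - gamma) * x + gamma * x_hat

    print("approximate stationary point:", x)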


Proximal and First-Order Methods for Convex Optimization

We describe the proximal method for minimization of convex functions. We review classical results, recent extensions, and interpretations of the proximal method that work in online and stochastic optimization settings.
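For instance (a minimal sketch, with the lasso-type objective, data, and step size chosen only for illustration), the proximal gradient method for min_x 0.5*||Ax - b||^2 + lam*||x||_1 alternates a gradient step on the smooth part with the proximal operator of the l1 term, which is soft-thresholding:

    import numpy as np

    def soft_threshold(v, t):
        # prox_{t * ||.||_1}(v): the proximal operator of the l1 norm.
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    rng = np.random.default_rng(0)
    m, n, lam = 40, 20, 0.1
    A = rng.normal(size=(m, n))
    b = rng.normal(size=m)

    x = np.zeros(n)
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the smooth gradient
    for _ in range(500):
        grad = A.T @ (A @ x - b)                # gradient of the smooth part 0.5*||Ax - b||^2
        x = soft_threshold(x - step * grad, step * lam)   # proximal step on the l1 part

    print("objective:", 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x)))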


Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization

We give a novel algorithm for stochastic strongly-convex optimization in the gradient oracle model which returns an O(1/T)-approximate solution after T gradient updates. This rate of convergence is optimal in the gradient oracle model. This improves upon the previously known best rate of O(log(T)/T), which was obtained by applying an online strongly-convex optimization algorithm with regre...
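For context (an illustrative sketch of one standard approach, not necessarily the algorithm of this paper; the toy objective and parameters are assumptions): for a lam-strongly convex objective, stochastic gradient descent with step sizes 1/(lam*t), combined with averaging of the later iterates, is a common way to reach an O(1/T) rate:

    import numpy as np

    rng = np.random.default_rng(0)
    d, lam, T = 5, 0.5, 2000          # dimension, strong-convexity parameter, iterations (assumed)
    mu = np.full(d, 2.0)              # F(w) = E_z[(lam/2)*||w - z||^2] with z ~ N(mu, I); minimizer is mu

    w = np.zeros(d)
    tail = []                         # iterates from the second half, for suffix averaging
    for t in range(1, T + 1):
        z = mu + rng.normal(size=d)
        g = lam * (w - z)             # stochastic gradient of (lam/2)*||w - z||^2
        w = w - g / (lam * t)         # step size 1 / (lam * t)
        if t > T // 2:
            tail.append(w)

    w_bar = np.mean(tail, axis=0)     # suffix average of the last T/2 iterates
    print("suffix-averaged solution:", w_bar)   # should be close to mu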


Stochastic Non-convex Optimization with Strong High Probability Second-order Convergence

In this paper, we study stochastic non-convex optimization with non-convex random functions. Recent studies on non-convex optimization revolve around establishing second-order convergence, i.e., convergence to a nearly second-order optimal stationary point. However, existing results on stochastic non-convex optimization are limited, especially with respect to high-probability second-order convergence. W...
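For reference (a commonly used formalization whose exact tolerances vary across papers), an approximate second-order stationary point of a twice-differentiable f is a point x whose gradient is small and whose Hessian is nearly positive semidefinite:

    \|\nabla f(x)\| \le \epsilon
    \qquad\text{and}\qquad
    \lambda_{\min}\big(\nabla^2 f(x)\big) \ge -\sqrt{\rho\,\epsilon},

where rho denotes a Lipschitz constant of the Hessian.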


Beyond the regret minimization barrier: optimal algorithms for stochastic strongly-convex optimization

We give novel algorithms for stochastic strongly-convex optimization in the gradient oracle model which return an O(1/T)-approximate solution after T iterations. The first algorithm is deterministic, and achieves this rate via gradient updates and historical averaging. The second algorithm is randomized, and is based on pure gradient steps with a random step size. This rate of convergence is o...




Publication date: 2009